An overview of Difference-in-Difference and Synthetic Control Methods: Classical and Novel Approaches

Society for Epidemiologic Research (SER) Workshop:
Staggered adoption: Part 4/4

Tarik Benmarhnia (UCSD & Scripps Institute; https://profiles.ucsd.edu/tarik.benmarhnia, https://benmarhniaresearch.ucsd.edu/), Roch Nianogo (Department of Epidemiology, UCLA Fielding School of Public Health; https://ph.ucla.edu/about/faculty-staff-directory/roch-nianogo)
July 15th, 2025

Load data

Let’s load the data

mydata <- read_csv(here("data", "sim_data_hte_staggered.csv"))

In this new dataset, there are 50 states: 15 treated and 35 untreated (controls). The intervention was implemented at different times, as follows:
- state.name[1:5] (“Alabama”, “Alaska”, “Arizona”, “Arkansas”, “California”) enacted policy in 2000
- state.name[6:10] (“Colorado”, “Connecticut”, “Delaware”, “Florida”, “Georgia”) enacted policy in 2003
- state.name[11:15] (“Hawaii”, “Idaho”, “Illinois”, “Indiana”, “Iowa”) enacted policy in 2006
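
The adoption schedule above can be written out as a small lookup table. This is a minimal sketch in base R, not part of the workshop data: the `cohorts` data frame and its `first_treated` column are names introduced here for illustration, and never-treated states are coded 0 to mirror the coding used later for the `did` package.

```r
# Hypothetical lookup table built from base R's state.name vector,
# following the adoption schedule described above.
cohorts <- data.frame(
  state            = state.name,
  first_treated    = 0,      # 0 marks never-treated (control) states
  stringsAsFactors = FALSE
)
cohorts$first_treated[1:5]   <- 2000  # Alabama through California
cohorts$first_treated[6:10]  <- 2003  # Colorado through Georgia
cohorts$first_treated[11:15] <- 2006  # Hawaii through Iowa

# Number of states per adoption cohort (0 = controls)
table(cohorts$first_treated)
```

Tabulating the cohorts is a quick check that each treated group has 5 states and that 35 states remain as controls.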

Visualize the data

mydata <- read_csv(here("data", "sim_data_hte_staggered.csv"))
Rows: 2500 Columns: 10
── Column specification ──────────────────────────────────────────────
Delimiter: ","
chr (1): state
dbl (9): state_num, year, treated, post, treatedpost, y, xit, xt, xi

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
p_load("panelView")
panelview(y ~ treatedpost, data = mydata,
          index = c("state","year"), 
          pre.post = TRUE) 
panelview(y ~ treatedpost, data = mydata, 
          index = c("state","year"), 
          type = "outcome",  
          by.group = TRUE)
Warning in panelview(y ~ treatedpost, data = mydata, index =
c("state", : option "by.cohort" is not allowed with "by.group = TRUE"
or "by.group.side = TRUE". Ignored.
#For all the data
mydata %>% 
  ggplot(aes(x=year, y=y, group=state)) + 
  annotate("rect", fill = "gray", alpha = 0.5,
           xmin = 2000, xmax = Inf,
           ymin = -Inf, ymax = Inf) +
  labs(title = paste("Outcome by year"),
       x = "Year", 
       y = "Outcome",
       color = "Treatment") +
  geom_line(aes(color=factor(treated)), linewidth=0.5) +
  scale_color_discrete(labels=c("Control", "Treated")) +
  geom_vline(xintercept = 2000, lty=2) +
  theme_bw() +
  theme(plot.title = element_text(hjust = 0.5)) 

Analyze the data: Callaway and Sant’Anna

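For more information, see the paper: Callaway and Sant’Anna (2021), “Difference-in-Differences with Multiple Time Periods,” Journal of Econometrics, https://doi.org/10.1016/j.jeconom.2020.12.001. The accompanying website also has a nice tutorial.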

Load package and create new variables

p_load(did)

mydata1 <- mydata %>% 
  mutate(first_treated = case_when( (state %in% state.name[1:5])   ~ 2000,
                                    (state %in% state.name[6:10])  ~ 2003,
                                    (state %in% state.name[11:15]) ~ 2006,
                                    TRUE~0))

paged_table(mydata1)

Group-time effects

group_time_effects <- att_gt( yname  = "y",
                              tname  = "year",
                              idname = "state_num",
                              gname  = "first_treated",
                              xformla = ~ xit,
                              data = mydata1)
Warning in pre_process_did(yname = yname, tname = tname, idname = idname, : Be aware that there are some small groups in your dataset.
  Check groups: 2000,2003,2006.
Warning in att_gt(yname = "y", tname = "year", idname = "state_num",
gname = "first_treated", : Not returning pre-test Wald statistic due
to singular covariance matrix
summary(group_time_effects)

Call:
att_gt(yname = "y", tname = "year", idname = "state_num", gname = "first_treated", 
    xformla = ~xit, data = mydata1)

Reference: Callaway, Brantly and Pedro H.C. Sant'Anna.  "Difference-in-Differences with Multiple Time Periods." Journal of Econometrics, Vol. 225, No. 2, pp. 200-230, 2021. <https://doi.org/10.1016/j.jeconom.2020.12.001>, <https://arxiv.org/abs/1803.09015> 

Group-Time Average Treatment Effects:
 Group Time ATT(g,t) Std. Error [95% Simult.  Conf. Band]  
  2000 1962  -0.2620     0.8269       -2.5129      1.9888  
  2000 1963   0.8596     0.3730       -0.1556      1.8748  
  2000 1964  -2.0749     0.5371       -3.5371     -0.6128 *
  2000 1965   0.9764     0.6995       -0.9275      2.8804  
  2000 1966   0.1867     0.9726       -2.4609      2.8342  
  2000 1967   0.0187     0.4489       -1.2033      1.2406  
  2000 1968  -0.3948     0.4090       -1.5082      0.7186  
  2000 1969   0.6199     0.7058       -1.3011      2.5410  
  2000 1970  -0.4810     0.5940       -2.0979      1.1359  
  2000 1971   0.7035     0.4149       -0.4258      1.8328  
  2000 1972  -0.6007     0.6036       -2.2437      1.0423  
  2000 1973   0.6623     0.6060       -0.9874      2.3120  
  2000 1974  -0.6884     0.9811       -3.3588      1.9820  
  2000 1975   0.4010     0.9593       -2.2101      3.0121  
  2000 1976   0.3134     1.2877       -3.1918      3.8185  
  2000 1977  -0.5069     0.8205       -2.7403      1.7264  
  2000 1978  -0.2897     0.3399       -1.2149      0.6355  
  2000 1979   0.1919     0.3582       -0.7832      1.1670  
  2000 1980   0.9995     0.6259       -0.7041      2.7031  
  2000 1981  -1.9423     0.6615       -3.7428     -0.1418 *
  2000 1982   1.0972     0.4387       -0.0969      2.2913  
  2000 1983   0.5662     0.4763       -0.7302      1.8625  
  2000 1984  -0.9455     0.5456       -2.4307      0.5397  
  2000 1985   0.7564     0.7012       -1.1524      2.6651  
  2000 1986  -0.3502     0.9931       -3.0533      2.3529  
  2000 1987  -0.0923     0.6150       -1.7662      1.5817  
  2000 1988  -0.3408     0.2803       -1.1039      0.4223  
  2000 1989  -0.4171     0.4604       -1.6704      0.8363  
  2000 1990   1.1116     0.7680       -0.9789      3.2021  
  2000 1991   0.3985     1.0880       -2.5630      3.3599  
  2000 1992  -0.6315     0.4790       -1.9353      0.6723  
  2000 1993   0.0359     0.5958       -1.5858      1.6576  
  2000 1994  -0.1197     0.6545       -1.9012      1.6618  
  2000 1995  -0.5594     0.5481       -2.0514      0.9326  
  2000 1996   0.2045     0.3393       -0.7190      1.1279  
  2000 1997   0.6682     0.5214       -0.7512      2.0876  
  2000 1998  -0.3049     0.7798       -2.4277      1.8178  
  2000 1999  -1.0091     0.4417       -2.2115      0.1932  
  2000 2000  49.9829     0.7695       47.8882     52.0775 *
  2000 2001  51.2656     1.4634       47.2822     55.2489 *
  2000 2002  51.5659     0.6990       49.6633     53.4686 *
  2000 2003  52.3739     1.0660       49.4722     55.2757 *
  2000 2004  51.2495     0.9365       48.7003     53.7986 *
  2000 2005  50.6407     0.7161       48.6913     52.5900 *
  2000 2006  51.0882     0.9555       48.4873     53.6891 *
  2000 2007  51.6237     1.1483       48.4979     54.7495 *
  2000 2008  49.6271     0.7467       47.5946     51.6596 *
  2000 2009  52.7165     1.3259       49.1074     56.3256 *
  2000 2010  51.4586     0.7028       49.5457     53.3715 *
  2003 1962   0.0718     0.5709       -1.4822      1.6259  
  2003 1963   1.1817     0.3282        0.2883      2.0752 *
  2003 1964  -1.5577     0.8583       -3.8940      0.7786  
  2003 1965   0.1479     1.0814       -2.7956      3.0914  
  2003 1966   0.5047     1.5451       -3.7009      4.7103  
  2003 1967  -0.6410     1.0631       -3.5348      2.2528  
  2003 1968   1.3154     0.9124       -1.1680      3.7989  
  2003 1969  -0.1430     0.5975       -1.7694      1.4833  
  2003 1970  -0.4499     0.9455       -3.0236      2.1238  
  2003 1971   0.7534     1.2155       -2.5551      4.0618  
  2003 1972  -1.0148     1.3316       -4.6394      2.6098  
  2003 1973  -0.7429     1.2307       -4.0928      2.6069  
  2003 1974   0.2819     0.7972       -1.8881      2.4520  
  2003 1975   0.8578     0.7315       -1.1334      2.8490  
  2003 1976   1.2746     0.7082       -0.6530      3.2022  
  2003 1977  -2.7111     0.9616       -5.3287     -0.0935 *
  2003 1978   0.3345     1.0760       -2.5944      3.2635  
  2003 1979   0.7477     0.9895       -1.9459      3.4412  
  2003 1980   0.6420     0.6143       -1.0300      2.3141  
  2003 1981  -1.5449     1.1014       -4.5429      1.4532  
  2003 1982   1.6706     1.2734       -1.7955      5.1368  
  2003 1983  -0.5434     0.4890       -1.8743      0.7876  
  2003 1984   0.3988     0.6096       -1.2605      2.0581  
  2003 1985   0.3084     0.7628       -1.7678      2.3847  
  2003 1986  -1.0176     0.6640       -2.8250      0.7898  
  2003 1987   0.3036     0.8019       -1.8792      2.4863  
  2003 1988   0.2360     0.5391       -1.2315      1.7034  
  2003 1989   0.1681     0.6549       -1.6146      1.9507  
  2003 1990  -0.8382     0.5657       -2.3781      0.7017  
  2003 1991   1.0345     1.0654       -1.8655      3.9345  
  2003 1992  -1.5943     1.0524       -4.4590      1.2703  
  2003 1993   1.6480     0.7890       -0.4995      3.7956  
  2003 1994  -0.5525     0.5113       -1.9444      0.8394  
  2003 1995   0.7012     0.5531       -0.8045      2.2068  
  2003 1996  -0.9851     0.5083       -2.3686      0.3985  
  2003 1997  -0.6565     0.7520       -2.7035      1.3905  
  2003 1998   0.7634     0.3528       -0.1969      1.7237  
  2003 1999  -1.4632     0.4947       -2.8098     -0.1165 *
  2003 2000   1.9567     0.3703        0.9487      2.9646 *
  2003 2001   0.0256     1.0591       -2.8574      2.9086  
  2003 2002   0.0362     0.9513       -2.5533      2.6256  
  2003 2003  99.0085     0.6146       97.3356    100.6814 *
  2003 2004  99.5673     0.8876       97.1512    101.9834 *
  2003 2005  99.5363     0.5445       98.0541    101.0185 *
  2003 2006  98.7809     0.7361       96.7773    100.7845 *
  2003 2007  99.2121     0.8438       96.9153    101.5088 *
  2003 2008  98.4466     0.9530       95.8526    101.0406 *
  2003 2009 100.8014     0.9509       98.2131    103.3897 *
  2003 2010  99.0418     0.6697       97.2188    100.8647 *
  2006 1962   0.0511     0.2177       -0.5414      0.6436  
  2006 1963   0.6249     0.6308       -1.0921      2.3420  
  2006 1964  -2.7170     0.2802       -3.4798     -1.9542 *
  2006 1965   1.6024     0.2632        0.8861      2.3187 *
  2006 1966   1.0555     0.4946       -0.2908      2.4018  
  2006 1967  -2.4754     0.3925       -3.5438     -1.4070 *
  2006 1968   2.5717     0.3041        1.7441      3.3994 *
  2006 1969  -0.8535     0.4514       -2.0822      0.3751  
  2006 1970  -0.7874     0.2234       -1.3954     -0.1793 *
  2006 1971   1.4289     0.3838        0.3842      2.4736 *
  2006 1972  -1.8847     0.5490       -3.3790     -0.3904 *
  2006 1973   1.4229     0.4569        0.1791      2.6667 *
  2006 1974  -1.3698     0.7321       -3.3626      0.6229  
  2006 1975   1.2419     0.5846       -0.3494      2.8333  
  2006 1976   0.5061     0.5226       -0.9164      1.9286  
  2006 1977  -2.3132     0.6148       -3.9868     -0.6396 *
  2006 1978   1.2382     0.2901        0.4484      2.0279 *
  2006 1979   0.1057     0.7192       -1.8520      2.0633  
  2006 1980   0.0619     1.1893       -3.1754      3.2991  
  2006 1981  -0.8778     0.6247       -2.5783      0.8227  
  2006 1982   1.3548     0.3954        0.2786      2.4311 *
  2006 1983  -0.3826     1.0080       -3.1262      2.3611  
  2006 1984   0.0961     0.6165       -1.5820      1.7742  
  2006 1985   0.8837     0.1912        0.3633      1.4041 *
  2006 1986  -1.0373     0.7014       -2.9464      0.8718  
  2006 1987  -0.5428     0.7310       -2.5327      1.4470  
  2006 1988   0.5303     0.5056       -0.8459      1.9066  
  2006 1989  -0.3930     0.4141       -1.5201      0.7341  
  2006 1990   0.0164     0.4795       -1.2887      1.3216  
  2006 1991   1.2929     0.8039       -0.8953      3.4811  
  2006 1992  -1.4642     0.7989       -3.6389      0.7104  
  2006 1993   1.8158     0.4901        0.4817      3.1499 *
  2006 1994  -3.1738     0.2983       -3.9858     -2.3619 *
  2006 1995   2.2523     0.3320        1.3484      3.1561 *
  2006 1996  -1.1757     0.5193       -2.5892      0.2379  
  2006 1997   0.6384     0.4716       -0.6454      1.9221  
  2006 1998   0.5867     0.5718       -0.9697      2.1431  
  2006 1999  -1.8927     0.4889       -3.2235     -0.5620 *
  2006 2000   0.0785     0.8193       -2.1517      2.3087  
  2006 2001   3.4349     0.7325        1.4412      5.4286 *
  2006 2002  -1.7928     1.1819       -5.0101      1.4244  
  2006 2003   0.9458     0.7738       -1.1606      3.0522  
  2006 2004   0.0706     0.6734       -1.7624      1.9035  
  2006 2005  -1.9094     0.3508       -2.8642     -0.9546 *
  2006 2006 150.5473     0.4430      149.3415    151.7530 *
  2006 2007 150.7607     0.4005      149.6706    151.8507 *
  2006 2008 149.5590     0.6704      147.7342    151.3837 *
  2006 2009 152.6349     0.5497      151.1386    154.1312 *
  2006 2010 150.2910     0.2599      149.5834    150.9985 *
---
Signif. codes: `*' confidence band does not cover 0

Control Group:  Never Treated,  Anticipation Periods:  0
Estimation Method:  Doubly Robust
ggdid(group_time_effects)

Simple Aggregation

agg.simple <- aggte(group_time_effects, type = "simple")
summary(agg.simple)

Call:
aggte(MP = group_time_effects, type = "simple")

Reference: Callaway, Brantly and Pedro H.C. Sant'Anna.  "Difference-in-Differences with Multiple Time Periods." Journal of Econometrics, Vol. 225, No. 2, pp. 200-230, 2021. <https://doi.org/10.1016/j.jeconom.2020.12.001>, <https://arxiv.org/abs/1803.09015> 

     ATT    Std. Error     [ 95%  Conf. Int.]  
 87.9908       10.2595    67.8825    108.0992 *


---
Signif. codes: `*' confidence band does not cover 0

Control Group:  Never Treated,  Anticipation Periods:  0
Estimation Method:  Doubly Robust

Dynamic Effects (Event Studies): Effect by length of exposure

agg.es <- aggte(group_time_effects, type = "dynamic")
summary(agg.es)

Call:
aggte(MP = group_time_effects, type = "dynamic")

Reference: Callaway, Brantly and Pedro H.C. Sant'Anna.  "Difference-in-Differences with Multiple Time Periods." Journal of Econometrics, Vol. 225, No. 2, pp. 200-230, 2021. <https://doi.org/10.1016/j.jeconom.2020.12.001>, <https://arxiv.org/abs/1803.09015> 


Overall summary of ATT's based on event-study/dynamic aggregation:  
     ATT    Std. Error     [ 95%  Conf. Int.]  
 80.1577         6.444    67.5277     92.7878 *


Dynamic Effects:
 Event time Estimate Std. Error [95% Simult.  Conf. Band]  
        -44   0.0511     0.1915       -0.5054      0.6076  
        -43   0.6249     0.6024       -1.1254      2.3753  
        -42  -2.7170     0.3198       -3.6463     -1.7877 *
        -41   0.8371     0.4315       -0.4167      2.0910  
        -40   1.1186     0.2864        0.2863      1.9509 *
        -39  -2.0165     0.4767       -3.4017     -0.6314 *
        -38   0.8192     0.5364       -0.7395      2.3779  
        -37   0.1703     0.5586       -1.4530      1.7935  
        -36  -1.1678     0.4437       -2.4572      0.1216  
        -35   1.2403     0.3625        0.1869      2.2936 *
        -34  -0.6137     0.4347       -1.8768      0.6494  
        -33   0.3306     0.4590       -1.0031      1.6642  
        -32  -0.3371     0.5337       -1.8878      1.2137  
        -31   0.2823     0.5805       -1.4046      1.9693  
        -30  -0.2393     0.5462       -1.8263      1.3478  
        -29  -0.4426     0.5868       -2.1477      1.2625  
        -28   0.4984     0.3745       -0.5896      1.5865  
        -27   0.6809     0.3981       -0.4758      1.8375  
        -26  -1.1125     0.5865       -2.8167      0.5917  
        -25  -0.0474     0.5423       -1.6231      1.5282  
        -24   0.8053     0.5472       -0.7846      2.3952  
        -23  -0.0825     0.4320       -1.3378      1.1728  
        -22  -0.5795     0.3993       -1.7397      0.5807  
        -21   0.9154     0.4318       -0.3392      2.1700  
        -20  -0.1937     0.4228       -1.4224      1.0349  
        -19  -0.6954     0.4572       -2.0241      0.6332  
        -18   0.6453     0.3331       -0.3225      1.6131  
        -17  -0.2815     0.3048       -1.1672      0.6043  
        -16  -0.2085     0.3501       -1.2259      0.8089  
        -15   0.7617     0.3840       -0.3542      1.8777  
        -14  -0.5488     0.4767       -1.9339      0.8363  
        -13   0.2951     0.4023       -0.8739      1.4641  
        -12  -0.8267     0.5954       -2.5568      0.9034  
        -11   0.0803     0.5421       -1.4949      1.6555  
        -10   0.5280     0.5514       -1.0741      2.1301  
         -9   0.1615     0.3814       -0.9469      1.2698  
         -8   0.2188     0.3325       -0.7475      1.1851  
         -7  -0.9473     0.3252       -1.8923     -0.0023 *
         -6  -0.2325     0.3993       -1.3928      0.9277  
         -5   1.2130     0.5469       -0.3763      2.8022  
         -4  -1.0172     0.4799       -2.4116      0.3772  
         -3   1.1902     0.3545        0.1601      2.2204 *
         -2  -0.0696     0.4471       -1.3686      1.2295  
         -1  -0.9608     0.4410       -2.2422      0.3206  
          0  99.8462    10.0624       70.6075    129.0849 *
          1 100.5312    10.1668       70.9891    130.0733 *
          2 100.2204     9.8409       71.6254    128.8154 *
          3 101.2632    10.2197       71.5675    130.9589 *
          4 100.2508     9.9113       71.4511    129.0505 *
          5  74.5436     7.2627       53.4401     95.6471 *
          6  75.9448     7.8020       53.2741     98.6155 *
          7  75.3328     7.1811       54.4665     96.1990 *
          8  49.6271     0.7163       47.5457     51.7086 *
          9  52.7165     1.2472       49.0925     56.3404 *
         10  51.4586     0.7250       49.3520     53.5652 *
---
Signif. codes: `*' confidence band does not cover 0

Control Group:  Never Treated,  Anticipation Periods:  0
Estimation Method:  Doubly Robust
ggdid(agg.es)
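
The drop in the event-study estimates at exposure lengths 5 and 8 reflects a change in which cohorts contribute, not a fading effect. The sketch below uses only base R and values copied from the group-time output above. It assumes that, because every cohort has 5 states, the cohort weights reduce to a simple average over the cohorts observed at each exposure length. The object names are illustrative.

```r
# ATT(g,t) values copied from the group-time output above,
# indexed by exposure length e = year - first treated year.
att_2000 <- c(e0 = 49.9829,  e5 = 50.6407, e8 = 49.6271)  # years 2000, 2005, 2008
att_2003 <- c(e0 = 99.0085,  e5 = 98.4466)                # years 2003, 2008
att_2006 <- c(e0 = 150.5473)                              # year 2006

# e = 0: all three cohorts are observed
e0 <- mean(c(att_2000[["e0"]], att_2003[["e0"]], att_2006[["e0"]]))
# e = 5: the 2006 cohort has dropped out (its data end at e = 4)
e5 <- mean(c(att_2000[["e5"]], att_2003[["e5"]]))
# e = 8: only the 2000 cohort remains
e8 <- att_2000[["e8"]]

round(c(e0 = e0, e5 = e5, e8 = e8), 2)
```

These averages line up with the reported dynamic effects at event times 0, 5, and 8 (about 99.85, 74.54, and 49.63), so the stepwise decline is a composition effect across cohorts with different effect sizes.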

Group-Specific Effects: Effect by group

agg.gs <- aggte(group_time_effects, type = "group")
summary(agg.gs)

Call:
aggte(MP = group_time_effects, type = "group")

Reference: Callaway, Brantly and Pedro H.C. Sant'Anna.  "Difference-in-Differences with Multiple Time Periods." Journal of Econometrics, Vol. 225, No. 2, pp. 200-230, 2021. <https://doi.org/10.1016/j.jeconom.2020.12.001>, <https://arxiv.org/abs/1803.09015> 


Overall summary of ATT's based on group/cohort aggregation:  
      ATT    Std. Error     [ 95%  Conf. Int.]  
 100.4312        0.3068    99.8299    101.0325 *


Group Effects:
 Group Estimate Std. Error [95% Simult.  Conf. Band]  
  2000  51.2357     0.7280       49.7656     52.7058 *
  2003  99.2994     0.5344       98.2203    100.3784 *
  2006 150.7585     0.3110      150.1306    151.3865 *
---
Signif. codes: `*' confidence band does not cover 0

Control Group:  Never Treated,  Anticipation Periods:  0
Estimation Method:  Doubly Robust
ggdid(agg.gs)
# These are the group-specific effects that can be used to estimate the overall ATT

Calendar Time effects

agg.ct <- aggte(group_time_effects, type = "calendar")
summary(agg.ct)

Call:
aggte(MP = group_time_effects, type = "calendar")

Reference: Callaway, Brantly and Pedro H.C. Sant'Anna.  "Difference-in-Differences with Multiple Time Periods." Journal of Econometrics, Vol. 225, No. 2, pp. 200-230, 2021. <https://doi.org/10.1016/j.jeconom.2020.12.001>, <https://arxiv.org/abs/1803.09015> 


Overall summary of ATT's based on calendar time aggregation:  
    ATT    Std. Error     [ 95%  Conf. Int.]  
 80.109        6.5566    67.2583     92.9597 *


Time Effects:
 Time Estimate Std. Error [95% Simult.  Conf. Band]  
 2000  49.9829     0.7327       48.1557     51.8100 *
 2001  51.2656     1.3206       47.9722     54.5590 *
 2002  51.5659     0.6718       49.8905     53.2413 *
 2003  75.6912     7.3123       57.4549     93.9275 *
 2004  75.4084     7.4427       56.8469     93.9699 *
 2005  75.0885     7.4487       56.5122     93.6648 *
 2006 100.1388    10.0195       75.1510    125.1265 *
 2007 100.5322    10.0772       75.4005    125.6639 *
 2008  99.2109    10.0486       74.1505    124.2713 *
 2009 102.0509    10.2358       76.5238    127.5780 *
 2010 100.2638     9.9858       75.3601    125.1675 *
---
Signif. codes: `*' confidence band does not cover 0

Control Group:  Never Treated,  Anticipation Periods:  0
Estimation Method:  Doubly Robust
ggdid(agg.ct)

Analyze the data: Simple DID

Now that we have seen the different effects that can be obtained with the method above, let us see what other models give us.

p_load("estimatr")
dta <- lm_robust(y ~ treatedpost + factor(year) + xit, 
                 data = mydata,
                 fixed_effects=state,
                 clusters = state, 
                 se_type = "stata")
dta
                   Estimate Std. Error    t value     Pr(>|t|)
treatedpost       87.325018 9.82403093   8.888919 8.636889e-12
factor(year)1962   3.805877 0.19858684  19.164797 2.066841e-24
factor(year)1963   7.702049 0.21733283  35.438958 1.382314e-36
factor(year)1964   8.860991 0.19933799  44.452095 2.898574e-41
factor(year)1965  10.976055 0.22396128  49.008719 2.693782e-43
factor(year)1966  15.497606 0.27232979  56.907495 2.015910e-46
factor(year)1967  16.154480 0.22342449  72.303979 1.854232e-51
factor(year)1968  15.901553 0.26908120  59.095740 3.257094e-47
factor(year)1969  19.998661 0.34226735  58.429941 5.631687e-47
factor(year)1970  24.133909 0.35326767  68.316212 2.909474e-50
factor(year)1971  27.006851 0.35609716  75.841243 1.822107e-52
factor(year)1972  29.743890 0.33258138  89.433420 5.987236e-56
factor(year)1973  30.155062 0.21174316 142.413391 8.227494e-66
factor(year)1974  33.856762 0.29513842 114.714859 3.193713e-61
factor(year)1975  35.994452 0.26531167 135.668562 8.814843e-65
factor(year)1976  40.742812 0.37139011 109.703546 2.826387e-60
factor(year)1977  41.875247 0.34252156 122.255801 1.425881e-62
factor(year)1978  41.121062 0.29930528 137.388358 4.762542e-65
factor(year)1979  45.962363 0.48281198  95.197231 2.854883e-57
factor(year)1980  47.315831 0.40972170 115.482854 2.305765e-61
factor(year)1981  50.078337 0.43008897 116.437158 1.542795e-61
factor(year)1982  52.111395 0.48217789 108.075039 5.864289e-60
factor(year)1983  54.729304 0.34709699 157.677264 5.665087e-68
factor(year)1984  57.379535 0.46182636 124.244823 6.482367e-63
factor(year)1985  62.090587 0.41722703 148.817267 9.579938e-67
factor(year)1986  61.547599 0.44985870 136.815400 5.841771e-65
factor(year)1987  67.224168 0.45006855 149.364285 8.006681e-67
factor(year)1988  68.558156 0.48945996 140.068978 1.852053e-65
factor(year)1989  68.754075 0.54962006 125.093824 4.647928e-63
factor(year)1990  74.676405 0.46897719 159.232488 3.505401e-68
factor(year)1991  77.796638 0.55643866 139.811706 2.026192e-65
factor(year)1992  78.613908 0.43968043 178.797831 1.209772e-70
factor(year)1993  81.722417 0.45718676 178.750619 1.225504e-70
factor(year)1994  83.595372 0.64177162 130.257196 6.442630e-64
factor(year)1995  84.478203 0.57809337 146.132454 2.333106e-66
factor(year)1996  90.220939 0.49104702 183.731771 3.193528e-71
factor(year)1997  89.997076 0.50057615 179.786983 9.235805e-71
factor(year)1998  93.969846 0.55747067 168.564646 2.162213e-69
factor(year)1999  97.431784 0.63275009 153.981463 1.806928e-67
factor(year)2000  92.664698 1.35158026  68.560263 2.447364e-50
factor(year)2001  96.955213 1.38393335  70.057718 8.579439e-51
factor(year)2002  99.069756 1.36924761  72.353426 1.793693e-51
factor(year)2003 100.419616 1.45156752  69.180120 1.581560e-50
factor(year)2004 106.697681 1.42879243  74.676824 3.864251e-52
factor(year)2005 107.893435 1.46166275  73.815547 6.788827e-52
factor(year)2006 115.434307 1.04433231 110.534076 1.955981e-60
factor(year)2007 117.506257 1.01363942 115.925106 1.913226e-61
factor(year)2008 122.246112 1.00710633 121.383521 2.022876e-62
factor(year)2009 127.725705 1.02201556 124.974325 4.870072e-63
factor(year)2010 125.573307 1.01903755 123.227360 9.686637e-63
xit                3.054344 0.03832003  79.706198 1.626184e-53
                   CI Lower   CI Upper DF
treatedpost       67.582888 107.067147 49
factor(year)1962   3.406801   4.204952 49
factor(year)1963   7.265302   8.138796 49
factor(year)1964   8.460406   9.261576 49
factor(year)1965  10.525988  11.426122 49
factor(year)1966  14.950339  16.044874 49
factor(year)1967  15.705491  16.603468 49
factor(year)1968  15.360814  16.442292 49
factor(year)1969  19.310849  20.686473 49
factor(year)1970  23.423991  24.843827 49
factor(year)1971  26.291247  27.722455 49
factor(year)1972  29.075543  30.412237 49
factor(year)1973  29.729548  30.580576 49
factor(year)1974  33.263659  34.449865 49
factor(year)1975  35.461288  36.527616 49
factor(year)1976  39.996476  41.489148 49
factor(year)1977  41.186924  42.563570 49
factor(year)1978  40.519585  41.722538 49
factor(year)1979  44.992116  46.932610 49
factor(year)1980  46.492465  48.139198 49
factor(year)1981  49.214041  50.942633 49
factor(year)1982  51.142422  53.080367 49
factor(year)1983  54.031786  55.426821 49
factor(year)1984  56.451460  58.307609 49
factor(year)1985  61.252138  62.929036 49
factor(year)1986  60.643574  62.451624 49
factor(year)1987  66.319721  68.128615 49
factor(year)1988  67.574549  69.541763 49
factor(year)1989  67.649572  69.858578 49
factor(year)1990  73.733960  75.618850 49
factor(year)1991  76.678433  78.914844 49
factor(year)1992  77.730337  79.497479 49
factor(year)1993  80.803665  82.641168 49
factor(year)1994  82.305683  84.885060 49
factor(year)1995  83.316481  85.639925 49
factor(year)1996  89.234144  91.207735 49
factor(year)1997  88.991131  91.003022 49
factor(year)1998  92.849567  95.090126 49
factor(year)1999  96.160225  98.703343 49
factor(year)2000  89.948596  95.380800 49
factor(year)2001  94.174095  99.736331 49
factor(year)2002  96.318150 101.821362 49
factor(year)2003  97.502582 103.336650 49
factor(year)2004 103.826415 109.568947 49
factor(year)2005 104.956114 110.830756 49
factor(year)2006 113.335642 117.532971 49
factor(year)2007 115.469272 119.543242 49
factor(year)2008 120.222256 124.269968 49
factor(year)2009 125.671888 129.779522 49
factor(year)2010 123.525474 127.621140 49
xit                2.977337   3.131351 49
did <- round(data.frame(ATT     = dta$coefficients["treatedpost"], 
                        se      = dta$std.error["treatedpost"],
                        low_ci  = dta$conf.low["treatedpost"],
                        high_ci = dta$conf.high["treatedpost"]),2)
did
              ATT   se low_ci high_ci
treatedpost 87.33 9.82  67.58  107.07

The coefficient above (87.33) is not the group-based overall ATT (100.43 from the group aggregation) and could also be biased when adoption is staggered and effects differ across groups. It is close to the simple aggregation from the did package (87.99).
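
To see why the simple and group-based summaries differ, the sketch below recomputes both from the group-specific effects reported earlier, using base R only. It assumes equal cohort sizes (5 states each), so that the simple aggregation reduces to weighting each cohort by its number of post-treatment years, while the group aggregation weights the cohorts equally. The object names are illustrative.

```r
# Group-specific ATTs copied from the aggte(type = "group") output above.
group_att <- c(`2000` = 51.2357, `2003` = 99.2994, `2006` = 150.7585)

# Number of post-treatment years observed for each cohort through 2010.
n_post <- c(`2000` = 11, `2003` = 8, `2006` = 5)

# Group-based overall ATT: each cohort counts once.
group_overall <- mean(group_att)

# Simple aggregation: each post-treatment (group, year) cell counts once,
# so earlier adopters receive more weight.
simple_overall <- weighted.mean(group_att, w = n_post)

round(c(group = group_overall, simple = simple_overall), 2)
```

The early adopters have the smallest effect and the most post-treatment years, which pulls the simple summary (about 87.99) below the group-based summary (about 100.43).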

Analyze the data: Generalized SCM

y <- gsynth(y ~ treatedpost + xit, 
            data = mydata,  
            EM = F, 
            index = c("state","year"), 
            inference = "parametric", 
            se = TRUE,
            nboots = 100,  #so that it can run faster, default is 200
            r = c(0, 5), 
            CV = TRUE, 
            seed = 123,
            force = "two-way", 
            parallel = FALSE)
Cross-validating ... 
 r = 0; sigma2 = 0.99374; IC = 0.35642; PC = 0.94548; MSPE = 1.07954*
 r = 1; sigma2 = 0.95062; IC = 0.66196; PC = 1.14721; MSPE = 1.09708
 r = 2; sigma2 = 0.90559; IC = 0.95480; PC = 1.32515; MSPE = 1.12129
 r = 3; sigma2 = 0.87527; IC = 1.25357; PC = 1.50630; MSPE = 1.12952
 r = 4; sigma2 = 0.84474; IC = 1.54236; PC = 1.67237; MSPE = 1.15788
 r = 5; sigma2 = 0.81465; IC = 1.82186; PC = 1.82457; MSPE = 1.20554

 r* = 0

Simulating errors ....Bootstrapping ...
.
y1 <- round(data.frame(y$est.avg),2)


#Period-specific ATT
paged_table(y$est.att %>% as.data.frame())
#average ATT
y$est.avg
        Estimate      S.E. CI.lower CI.upper p.value
ATT.avg 87.70783 0.1338771 87.44543 87.97022       0
plot(y, type = "counterfactual", raw = "none", main="")
plot(y, type = "counterfactual", raw = "band", main="")
plot(y, type = "counterfactual", raw = "all")
plot(y, type = "ct", raw = "none", main = "", shade.post = FALSE)

As with the simple DID model, this average ATT (87.71) is not the group-based overall ATT; it is close to the simple aggregation from the did package. Another way to obtain an overall ATT is to estimate the effect separately for each group or unit (state) and then pool the estimates, as done below.

Analyze the data: Pooled effect

Method 1: Estimate single effects and pooled effects using the DID method

## Group 1: "Early adopters", policy enacted in 2000
data_filter_g1 <- mydata %>% 
  filter(state %in% state.name[1:5] | treated==0)

dta_g1 <- lm_robust(y ~ treatedpost + factor(year) + xit, 
                    data = data_filter_g1,
                    fixed_effects=state,
                    clusters = state, 
                    se_type = "stata")

did_g1 <- data.frame(group ="1-Early_adopters",
                     att     = dta_g1$coefficients["treatedpost"], 
                     se      = dta_g1$std.error["treatedpost"],
                     lowerci = dta_g1$conf.low["treatedpost"],
                     upperci = dta_g1$conf.high["treatedpost"],
                     row.names = NULL)
did_g1
             group      att        se  lowerci  upperci
1 1-Early_adopters 50.31357 0.1348753 50.04076 50.58638
## Group 2: "Mid adopters", policy enacted in 2003

data_filter_g2 <- mydata %>% 
  filter(state %in% state.name[6:10] | treated==0)

dta_g2 <- lm_robust(y ~ treatedpost + factor(year) + xit, 
                    data = data_filter_g2,
                    fixed_effects=state,
                    clusters = state, 
                    se_type = "stata")

did_g2 <- data.frame(group = "2-Mid-adopters",
                     att     = dta_g2$coefficients["treatedpost"], 
                     se      = dta_g2$std.error["treatedpost"],
                     lowerci = dta_g2$conf.low["treatedpost"],
                     upperci = dta_g2$conf.high["treatedpost"],
                     row.names = NULL)
did_g2
           group      att        se  lowerci  upperci
1 2-Mid-adopters 99.91164 0.1877363 99.53191 100.2914
## Group 3: "Late adopters", policy enacted in 2006


data_filter_g3 <- mydata %>% 
  filter(state %in% state.name[11:15] | treated==0)

dta_g3 <- lm_robust(y ~ treatedpost + factor(year) + xit, 
                    data = data_filter_g3,
                    fixed_effects=state,
                    clusters = state, 
                    se_type = "stata")

did_g3 <- data.frame(group = "3-Late-adopters",
                     att    = dta_g3$coefficients["treatedpost"], 
                     se      = dta_g3$std.error["treatedpost"],
                     lowerci = dta_g3$conf.low["treatedpost"],
                     upperci = dta_g3$conf.high["treatedpost"],
                     row.names = NULL)
did_g3
            group     att       se  lowerci  upperci
1 3-Late-adopters 150.359 0.179171 149.9966 150.7214
combined <- bind_rows(did_g1, did_g2, did_g3)
combined
             group       att        se   lowerci   upperci
1 1-Early_adopters  50.31357 0.1348753  50.04076  50.58638
2   2-Mid-adopters  99.91164 0.1877363  99.53191 100.29137
3  3-Late-adopters 150.35901 0.1791710 149.99660 150.72142
p_load("metafor")
metaresult <- rma(yi = att, 
                  sei = se, 
                  data = combined, 
                  slab = group, 
                  method = "ML")

combined2 <- combined %>% 
  add_row(group="4-Overall",
          att=as.vector(metaresult$beta),
          se = metaresult$se,
          lowerci = metaresult$ci.lb,
          upperci = metaresult$ci.ub)

combined2
             group       att         se   lowerci   upperci
1 1-Early_adopters  50.31357  0.1348753  50.04076  50.58638
2   2-Mid-adopters  99.91164  0.1877363  99.53191 100.29137
3  3-Late-adopters 150.35901  0.1791710 149.99660 150.72142
4        4-Overall 100.19460 23.5812667  53.97617 146.41303
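
The pooled row has a much wider interval than any single cohort because the random-effects model treats the large differences between cohorts as between-study variance. The base R sketch below illustrates the mechanics. Note that it swaps in the closed-form DerSimonian-Laird estimator of the between-cohort variance, not the maximum-likelihood estimator used by `rma(method = "ML")` above, so its standard error will not match the table. The object names are illustrative.

```r
# Cohort-specific DID estimates and standard errors copied from the table above.
att <- c(50.31357, 99.91164, 150.35901)
se  <- c(0.1348753, 0.1877363, 0.1791710)

# Fixed-effect (inverse-variance) pooling: precise cohorts dominate.
w_fe   <- 1 / se^2
att_fe <- sum(w_fe * att) / sum(w_fe)

# DerSimonian-Laird estimate of the between-cohort variance (tau^2).
# This is a stand-in for the ML estimator used in the text.
k    <- length(att)
Q    <- sum(w_fe * (att - att_fe)^2)
tau2 <- max(0, (Q - (k - 1)) / (sum(w_fe) - sum(w_fe^2) / sum(w_fe)))

# Random-effects pooling: tau^2 swamps the sampling variances,
# so the weights are nearly equal across cohorts.
w_re   <- 1 / (se^2 + tau2)
att_re <- sum(w_re * att) / sum(w_re)
se_re  <- sqrt(1 / sum(w_re))

round(c(fixed = att_fe, random = att_re), 2)
```

With heterogeneity this large, the random-effects estimate is essentially the unweighted mean of the three cohort effects (about 100.19), whereas fixed-effect pooling would shift toward the most precisely estimated cohort.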

Method 2: Estimate single effects and pooled effects using the generalized SCM

#Create a function to estimate the effect of each state
gsynth_meta <- function(states, data, nboots = 200){
  
  
  dat <- data %>%
    filter(state=={{states}} | treated == 0)
  
  y <- gsynth(y ~ treatedpost + xit, 
              data = dat,  
              EM = F, 
              index = c("state","year"), 
              inference = "parametric", 
              se = TRUE,
              nboots = nboots, 
              r = c(0, 5), 
              CV = TRUE, 
              seed = 123,
              force = "two-way", 
              parallel = FALSE)
  
  y1 <- data.frame(y$est.avg)
  
  res <- tibble(
    state = states, 
    ATT=y1$Estimate,
    SE=y1$S.E.,
    lowerci=y1$CI.lower,
    upperci=y1$CI.upper)
  
  res
  
  return(res)
}

#test the function
gsynth_meta(states="California",
            nboots = 100,
            data=mydata)
Cross-validating ... 
 r = 0; sigma2 = 0.99374; IC = 0.35642; PC = 0.94548; MSPE = 1.06914*
 r = 1; sigma2 = 0.95062; IC = 0.66196; PC = 1.14721; MSPE = 1.13832
 r = 2; sigma2 = 0.90559; IC = 0.95480; PC = 1.32515; MSPE = 1.09489
 r = 3; sigma2 = 0.87527; IC = 1.25357; PC = 1.50630; MSPE = 1.06844
 r = 4; sigma2 = 0.84474; IC = 1.54236; PC = 1.67237; MSPE = 1.09315
 r = 5; sigma2 = 0.81465; IC = 1.82186; PC = 1.82457; MSPE = 1.17677

 r* = 0

Simulating errors ....Bootstrapping ...
.
# A tibble: 1 × 5
  state        ATT    SE lowerci upperci
  <chr>      <dbl> <dbl>   <dbl>   <dbl>
1 California  50.3 0.352    49.6    51.0
#create the vector of treated state names
states <- mydata %>% 
  filter(treated == 1) %>% 
  select(state) %>% 
  distinct() %>% 
  as.matrix() %>% 
  as.vector()
states
 [1] "Alabama"     "Alaska"      "Arizona"     "Arkansas"   
 [5] "California"  "Colorado"    "Connecticut" "Delaware"   
 [9] "Florida"     "Georgia"     "Hawaii"      "Idaho"      
[13] "Illinois"    "Indiana"     "Iowa"       
#put the state names in a one-column tibble; the column name must match the `states` argument of gsynth_meta
list_states <- expand_grid(states) %>% print(n=Inf)
# A tibble: 15 × 1
   states     
   <chr>      
 1 Alabama    
 2 Alaska     
 3 Arizona    
 4 Arkansas   
 5 California 
 6 Colorado   
 7 Connecticut
 8 Delaware   
 9 Florida    
10 Georgia    
11 Hawaii     
12 Idaho      
13 Illinois   
14 Indiana    
15 Iowa       
#Loop through all the states
gsynth_res <- pmap_dfr(list_states, gsynth_meta, nboots=2, data =mydata)
Cross-validating ... 
 r = 0; sigma2 = 0.99374; IC = 0.35642; PC = 0.94548; MSPE = 0.64418*
 r = 1; sigma2 = 0.95062; IC = 0.66196; PC = 1.14721; MSPE = 0.68379
 r = 2; sigma2 = 0.90559; IC = 0.95480; PC = 1.32515; MSPE = 0.72328
 r = 3; sigma2 = 0.87527; IC = 1.25357; PC = 1.50630; MSPE = 0.75253
 r = 4; sigma2 = 0.84474; IC = 1.54236; PC = 1.67237; MSPE = 0.77352
 r = 5; sigma2 = 0.81465; IC = 1.82186; PC = 1.82457; MSPE = 0.80889

 r* = 0

Simulating errors ...Bootstrapping ...
Cross-validating ... 
 r = 0; sigma2 = 0.99374; IC = 0.35642; PC = 0.94548; MSPE = 1.26582*
 r = 1; sigma2 = 0.95062; IC = 0.66196; PC = 1.14721; MSPE = 1.33597
 r = 2; sigma2 = 0.90559; IC = 0.95480; PC = 1.32515; MSPE = 1.41130
 r = 3; sigma2 = 0.87527; IC = 1.25357; PC = 1.50630; MSPE = 1.43907
 r = 4; sigma2 = 0.84474; IC = 1.54236; PC = 1.67237; MSPE = 1.53695
 r = 5; sigma2 = 0.81465; IC = 1.82186; PC = 1.82457; MSPE = 1.53807

 r* = 0

Simulating errors ...Bootstrapping ...
Cross-validating ... 
 r = 0; sigma2 = 0.99374; IC = 0.35642; PC = 0.94548; MSPE = 0.92242*
 r = 1; sigma2 = 0.95062; IC = 0.66196; PC = 1.14721; MSPE = 0.93304
 r = 2; sigma2 = 0.90559; IC = 0.95480; PC = 1.32515; MSPE = 0.96230
 r = 3; sigma2 = 0.87527; IC = 1.25357; PC = 1.50630; MSPE = 1.00950
 r = 4; sigma2 = 0.84474; IC = 1.54236; PC = 1.67237; MSPE = 1.04442
 r = 5; sigma2 = 0.81465; IC = 1.82186; PC = 1.82457; MSPE = 1.09204

 r* = 0

Simulating errors ...Bootstrapping ...
Cross-validating ... 
 r = 0; sigma2 = 0.99374; IC = 0.35642; PC = 0.94548; MSPE = 0.86174*
 r = 1; sigma2 = 0.95062; IC = 0.66196; PC = 1.14721; MSPE = 0.91042
 r = 2; sigma2 = 0.90559; IC = 0.95480; PC = 1.32515; MSPE = 0.95074
 r = 3; sigma2 = 0.87527; IC = 1.25357; PC = 1.50630; MSPE = 1.02461
 r = 4; sigma2 = 0.84474; IC = 1.54236; PC = 1.67237; MSPE = 1.04184
 r = 5; sigma2 = 0.81465; IC = 1.82186; PC = 1.82457; MSPE = 1.16591

 r* = 0

Simulating errors ...Bootstrapping ...
Cross-validating ... 
 r = 0; sigma2 = 0.99374; IC = 0.35642; PC = 0.94548; MSPE = 1.06914*
 r = 1; sigma2 = 0.95062; IC = 0.66196; PC = 1.14721; MSPE = 1.13832
 r = 2; sigma2 = 0.90559; IC = 0.95480; PC = 1.32515; MSPE = 1.09489
 r = 3; sigma2 = 0.87527; IC = 1.25357; PC = 1.50630; MSPE = 1.06844
 r = 4; sigma2 = 0.84474; IC = 1.54236; PC = 1.67237; MSPE = 1.09315
 r = 5; sigma2 = 0.81465; IC = 1.82186; PC = 1.82457; MSPE = 1.17677

 r* = 0

Simulating errors ...Bootstrapping ...
Cross-validating ... 
 r = 0; sigma2 = 0.99374; IC = 0.35642; PC = 0.94548; MSPE = 1.26357*
 r = 1; sigma2 = 0.95062; IC = 0.66196; PC = 1.14721; MSPE = 1.31688
 r = 2; sigma2 = 0.90559; IC = 0.95480; PC = 1.32515; MSPE = 1.38237
 r = 3; sigma2 = 0.87527; IC = 1.25357; PC = 1.50630; MSPE = 1.34588
 r = 4; sigma2 = 0.84474; IC = 1.54236; PC = 1.67237; MSPE = 1.38770
 r = 5; sigma2 = 0.81465; IC = 1.82186; PC = 1.82457; MSPE = 1.52314

 r* = 0

Simulating errors ...Bootstrapping ...
Cross-validating ... 
 r = 0; sigma2 = 0.99374; IC = 0.35642; PC = 0.94548; MSPE = 1.37702*
 r = 1; sigma2 = 0.95062; IC = 0.66196; PC = 1.14721; MSPE = 1.41878
 r = 2; sigma2 = 0.90559; IC = 0.95480; PC = 1.32515; MSPE = 1.48540
 r = 3; sigma2 = 0.87527; IC = 1.25357; PC = 1.50630; MSPE = 1.43492
 r = 4; sigma2 = 0.84474; IC = 1.54236; PC = 1.67237; MSPE = 1.49865
 r = 5; sigma2 = 0.81465; IC = 1.82186; PC = 1.82457; MSPE = 1.51816

 r* = 0

Simulating errors ...Bootstrapping ...
Cross-validating ... 
 r = 0; sigma2 = 0.99374; IC = 0.35642; PC = 0.94548; MSPE = 1.00858
 r = 1; sigma2 = 0.95062; IC = 0.66196; PC = 1.14721; MSPE = 0.94450*
 r = 2; sigma2 = 0.90559; IC = 0.95480; PC = 1.32515; MSPE = 0.96840
 r = 3; sigma2 = 0.87527; IC = 1.25357; PC = 1.50630; MSPE = 0.89605*
 r = 4; sigma2 = 0.84474; IC = 1.54236; PC = 1.67237; MSPE = 0.93885
 r = 5; sigma2 = 0.81465; IC = 1.82186; PC = 1.82457; MSPE = 0.98340

 r* = 3

Simulating errors ...Bootstrapping ...
Cross-validating ... 
 r = 0; sigma2 = 0.99374; IC = 0.35642; PC = 0.94548; MSPE = 1.23771*
 r = 1; sigma2 = 0.95062; IC = 0.66196; PC = 1.14721; MSPE = 1.25700
 r = 2; sigma2 = 0.90559; IC = 0.95480; PC = 1.32515; MSPE = 1.29435
 r = 3; sigma2 = 0.87527; IC = 1.25357; PC = 1.50630; MSPE = 1.36304
 r = 4; sigma2 = 0.84474; IC = 1.54236; PC = 1.67237; MSPE = 1.20627*
 r = 5; sigma2 = 0.81465; IC = 1.82186; PC = 1.82457; MSPE = 1.29253

 r* = 4

Simulating errors ...Bootstrapping ...
Cross-validating ... 
 r = 0; sigma2 = 0.99374; IC = 0.35642; PC = 0.94548; MSPE = 1.44689
 r = 1; sigma2 = 0.95062; IC = 0.66196; PC = 1.14721; MSPE = 1.38143*
 r = 2; sigma2 = 0.90559; IC = 0.95480; PC = 1.32515; MSPE = 1.41352
 r = 3; sigma2 = 0.87527; IC = 1.25357; PC = 1.50630; MSPE = 1.41338
 r = 4; sigma2 = 0.84474; IC = 1.54236; PC = 1.67237; MSPE = 1.50633
 r = 5; sigma2 = 0.81465; IC = 1.82186; PC = 1.82457; MSPE = 1.49618

 r* = 1

Simulating errors ...Bootstrapping ...
Cross-validating ... 
 r = 0; sigma2 = 0.99374; IC = 0.35642; PC = 0.94548; MSPE = 0.98163*
 r = 1; sigma2 = 0.95062; IC = 0.66196; PC = 1.14721; MSPE = 0.99824
 r = 2; sigma2 = 0.90559; IC = 0.95480; PC = 1.32515; MSPE = 0.90518*
 r = 3; sigma2 = 0.87527; IC = 1.25357; PC = 1.50630; MSPE = 0.94306
 r = 4; sigma2 = 0.84474; IC = 1.54236; PC = 1.67237; MSPE = 0.93514
 r = 5; sigma2 = 0.81465; IC = 1.82186; PC = 1.82457; MSPE = 0.97587

 r* = 2

Simulating errors ...Bootstrapping ...
Cross-validating ... 
 r = 0; sigma2 = 0.99374; IC = 0.35642; PC = 0.94548; MSPE = 1.15811*
 r = 1; sigma2 = 0.95062; IC = 0.66196; PC = 1.14721; MSPE = 1.19254
 r = 2; sigma2 = 0.90559; IC = 0.95480; PC = 1.32515; MSPE = 1.20397
 r = 3; sigma2 = 0.87527; IC = 1.25357; PC = 1.50630; MSPE = 1.26865
 r = 4; sigma2 = 0.84474; IC = 1.54236; PC = 1.67237; MSPE = 1.32133
 r = 5; sigma2 = 0.81465; IC = 1.82186; PC = 1.82457; MSPE = 1.35986

 r* = 0

Simulating errors ...Bootstrapping ...
Cross-validating ... 
 r = 0; sigma2 = 0.99374; IC = 0.35642; PC = 0.94548; MSPE = 0.99759
 r = 1; sigma2 = 0.95062; IC = 0.66196; PC = 1.14721; MSPE = 0.97693*
 r = 2; sigma2 = 0.90559; IC = 0.95480; PC = 1.32515; MSPE = 1.00514
 r = 3; sigma2 = 0.87527; IC = 1.25357; PC = 1.50630; MSPE = 1.05921
 r = 4; sigma2 = 0.84474; IC = 1.54236; PC = 1.67237; MSPE = 1.08999
 r = 5; sigma2 = 0.81465; IC = 1.82186; PC = 1.82457; MSPE = 1.14096

 r* = 1

Simulating errors ...Bootstrapping ...
Cross-validating ... 
 r = 0; sigma2 = 0.99374; IC = 0.35642; PC = 0.94548; MSPE = 1.18469*
 r = 1; sigma2 = 0.95062; IC = 0.66196; PC = 1.14721; MSPE = 1.18814
 r = 2; sigma2 = 0.90559; IC = 0.95480; PC = 1.32515; MSPE = 1.23547
 r = 3; sigma2 = 0.87527; IC = 1.25357; PC = 1.50630; MSPE = 1.12889*
 r = 4; sigma2 = 0.84474; IC = 1.54236; PC = 1.67237; MSPE = 1.18063
 r = 5; sigma2 = 0.81465; IC = 1.82186; PC = 1.82457; MSPE = 1.16628

 r* = 3

Simulating errors ...Bootstrapping ...
Cross-validating ... 
 r = 0; sigma2 = 0.99374; IC = 0.35642; PC = 0.94548; MSPE = 0.75178*
 r = 1; sigma2 = 0.95062; IC = 0.66196; PC = 1.14721; MSPE = 0.77123
 r = 2; sigma2 = 0.90559; IC = 0.95480; PC = 1.32515; MSPE = 0.78375
 r = 3; sigma2 = 0.87527; IC = 1.25357; PC = 1.50630; MSPE = 0.80219
 r = 4; sigma2 = 0.84474; IC = 1.54236; PC = 1.67237; MSPE = 0.82338
 r = 5; sigma2 = 0.81465; IC = 1.82186; PC = 1.82457; MSPE = 0.86463

 r* = 0

Simulating errors ...Bootstrapping ...
#nboots = 2 is used here only to keep the run time short, so these standard errors are not reliable. In practice, use at least 200 bootstrap samples
gsynth_res
# A tibble: 15 × 5
   state         ATT     SE lowerci upperci
   <chr>       <dbl>  <dbl>   <dbl>   <dbl>
 1 Alabama      49.9 0.157     49.6    50.2
 2 Alaska       50.4 0.0863    50.2    50.5
 3 Arizona      50.1 0.123     49.9    50.4
 4 Arkansas     50.8 0.136     50.6    51.1
 5 California   50.3 0.112     50.0    50.5
 6 Colorado    100.  0.163    100.    101. 
 7 Connecticut 100.0 0.164     99.7   100. 
 8 Delaware     99.3 0.285     98.8    99.9
 9 Florida     100.0 0.0171    99.9   100. 
10 Georgia      99.7 0.172     99.4   100. 
11 Hawaii      150.  0.323    150.    151. 
12 Idaho       150.  0.357    149.    151. 
13 Illinois    150.  0.234    150.    151. 
14 Indiana     151.  0.406    150.    152. 
15 Iowa        150.  0.361    149.    151. 
#estimate the overall effect
metaresult2 <- rma(yi = ATT, 
                   sei = SE, 
                   data = gsynth_res, 
                   slab = state, 
                   method = "ML")


#combine the datasets
combined3 <- gsynth_res %>% 
  add_row(state="Overall",
          ATT=as.vector(metaresult2$beta),
          SE = metaresult2$se,
          lowerci = metaresult2$ci.lb,
          upperci = metaresult2$ci.ub)

combined3
# A tibble: 16 × 5
   state         ATT      SE lowerci upperci
   <chr>       <dbl>   <dbl>   <dbl>   <dbl>
 1 Alabama      49.9  0.157     49.6    50.2
 2 Alaska       50.4  0.0863    50.2    50.5
 3 Arizona      50.1  0.123     49.9    50.4
 4 Arkansas     50.8  0.136     50.6    51.1
 5 California   50.3  0.112     50.0    50.5
 6 Colorado    100.   0.163    100.    101. 
 7 Connecticut 100.0  0.164     99.7   100. 
 8 Delaware     99.3  0.285     98.8    99.9
 9 Florida     100.0  0.0171    99.9   100. 
10 Georgia      99.7  0.172     99.4   100. 
11 Hawaii      150.   0.323    150.    151. 
12 Idaho       150.   0.357    149.    151. 
13 Illinois    150.   0.234    150.    151. 
14 Indiana     151.   0.406    150.    152. 
15 Iowa        150.   0.361    149.    151. 
16 Overall     100.  10.5       79.5   121. 
#note that the pooled standard error is smaller here (10.5) than when pooling the 
#three group estimates (23.6), mainly because 15 state-level estimates are averaged instead of 3
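The gap between the two pooled standard errors (23.6 for three groups, 10.5 for fifteen states) can be checked with back-of-envelope arithmetic. When the between-estimate variance dominates the individual standard errors, a random-effects pooled standard error is roughly tau / sqrt(k), where k is the number of estimates being pooled. The sketch below assumes the effects sit at exactly 50, 100, and 150 in equal proportions, which is a simplification of the estimates shown above.

```r
# Approximate between-estimate SD when effects are 50, 100, 150 in equal shares
effects <- c(50, 100, 150)
tau <- sqrt(mean((effects - mean(effects))^2))

# Pooled SE is roughly tau / sqrt(k) when tau^2 dominates the sampling variances
se_3_groups  <- tau / sqrt(3)    # pooling the three cohort-level estimates
se_15_states <- tau / sqrt(15)   # pooling the fifteen state-level estimates

round(c(tau = tau, se_3_groups = se_3_groups, se_15_states = se_15_states), 2)
```

Both values line up with the `Overall` rows reported above, which suggests the narrower interval reflects the larger number of pooled estimates rather than more precise state-level fits.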

Method 3: Estimate single effects and pooled effects using the Augmented Synthetic Control Methods

p_load_gh("ebenmichael/augsynth")
set.seed(123)
augsynth.scm <-multisynth(y ~ treatedpost | xit,
                        unit = state, 
                        time = year, 
                        data = mydata, 
                        
                        fixedeff = T,
                        scm = T,
                        time_cohort = F, #can change this to T if interested in time cohorts instead of unit effects
                        progfunc="Ridge")
#progfunc = None for Traditional SCM,
#progfunc = Ridge for Ridge regression or augmented SCM
#progfunc = GSYN for the Generalized SCM

augsynth.scm 

Call:
multisynth(form = y ~ treatedpost | xit, unit = state, time = year, 
    data = mydata, fixedeff = T, scm = T, time_cohort = F, progfunc = "Ridge")

Average ATT Estimate: 100.135
res <- summary(augsynth.scm)
res

Call:
multisynth(form = y ~ treatedpost | xit, unit = state, time = year, 
    data = mydata, fixedeff = T, scm = T, time_cohort = F, progfunc = "Ridge")

Average ATT Estimate (Std. Error): 100.135  (72.068)

Global L2 Imbalance: 0.918
Scaled Global L2 Imbalance: 0.010
Percent improvement from uniform global weights: 99

Individual L2 Imbalance: 12.667
Scaled Individual L2 Imbalance: 0.116
Percent improvement from uniform individual weights: 88.4   

 Time Since Treatment   Level  Estimate Std.Error lower_bound upper_bound
                    0 Average 100.64440  69.77992  -11.828943    244.3967
                    1 Average 100.29211  74.15262  -17.224750    250.5294
                    2 Average  99.69507  72.99860  -16.226749    247.1873
                    3 Average  99.83579  65.01774   -9.376898    232.4593
                    4 Average 100.20612  79.52793  -23.752444    260.1102
paged_table(res$att)
#res$att is a data frame that contains all of the point estimates, standard errors, and lower/upper confidence limits. Time = NA denotes the effect averaged across the post-treatment periods.
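To pull the averaged effects out of a table shaped like `res$att`, one option is to keep only the rows where `Time` is `NA`. The sketch below uses a small stand-in data frame so it runs without fitting the model. The column names `Time`, `Level`, and `Estimate` are assumed to match the summary output, and the numbers are made up for illustration rather than taken from the fitted model.

```r
# Toy stand-in for res$att (values are illustrative, not model output)
att_toy <- data.frame(
  Time     = c(0, 1, NA, 0, 1, NA),
  Level    = c("Average", "Average", "Average",
               "Alabama", "Alabama", "Alabama"),
  Estimate = c(100.6, 100.3, 100.1, 50.2, 49.8, 50.0)
)

# Time = NA marks the effect averaged over the post-treatment periods
overall <- subset(att_toy, is.na(Time) & Level == "Average")   # pooled ATT
by_unit <- subset(att_toy, is.na(Time) & Level != "Average")   # one row per treated unit

overall
by_unit
```

With the real object, the same `subset()` calls applied to `res$att` would give the overall ATT and one averaged ATT per treated state, provided the column names match this assumption.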

plot(augsynth.scm)
Joining with `by = join_by(Level)`
Warning: The `<scale>` argument of `guides()` cannot be `FALSE`. Use "none"
instead as of ggplot2 3.3.4.
ℹ The deprecated feature was likely used in the augsynth package.
  Please report the issue to the authors.
This warning is displayed once every 8 hours.
Call `lifecycle::last_lifecycle_warnings()` to see where this warning
was generated.
Warning: Removed 61 rows containing missing values or values outside the scale
range (`geom_line()`).
Warning: Removed 61 rows containing missing values or values outside the scale
range (`geom_point()`).
plot(augsynth.scm, levels = "Average")
Joining with `by = join_by(Level)`
Warning: Removed 1 row containing missing values or values outside the scale
range (`geom_line()`).
Warning: Removed 1 row containing missing values or values outside the scale
range (`geom_point()`).